642 research outputs found

    A Support Vector Machine for the Discrimination of MicroRNA Precursors from Other Genomic Hairpin Structures

    Motivation: MicroRNAs (miRNAs) are endogenous, small (~ 20 nt), single-stranded, non-coding RNAs (ncRNAs) that result from the nuclear and cytoplasmic processing of transcribed precursor hairpin structures. They are increasingly recognized as playing crucial roles as post-transcriptional antisense regulators of gene expression through regulation of mRNA stability or translational efficiency. miRNAs, first reported in Caenorhabditis elegans, have been identified in the genomes of most higher organisms, including worms, flies, plants, mammals and recently in viruses. Functional studies have shown that miRNAs play important roles in processes such as cell proliferation, fat metabolism, apoptosis, neuronal cell fate, insulin secretion, haematopoietic differentiation and developmental regulation. The detection of homologs of known miRNAs through comparative genomic approaches has proved relatively tractable. However, the ab-initio prediction of miRNA precursors through computational methods poses several additional difficulties, not least the fact that not all thermodynamically plausible transcribed hairpins are processed to yield mature miRNAs. Until now, it has not been possible to identify conserved sequence or structural elements that define consensus recognition elements for the enzymes that process miRNA precursors. In the light of these observations, we wished to develop and improve methods for the discrimination of true miRNA precursor hairpins from spurious hairpins.
Methods: We have developed an SVM (Support Vector Machine) that considers up to 74 features associated with the primary and secondary structures and thermodynamic characteristics of candidate hairpin structures. We use a standard heuristic approach to optimize combinations of features used and train the SVM with sets of characterized hairpin miRNA precursors and known non-miRNA hairpins.
Results: Our SVM shows highly promising results in the discrimination of true miRNA precursors from “spurious” hairpins (typically around 95% sensitivity) in various species. In particular, our levels of false positive predictions appear to be low relative to comparable methods
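
To make the idea of "features associated with the primary and secondary structures" concrete, here is a minimal sketch of how a few such features might be computed from a sequence and its dot-bracket structure. The abstract does not list the actual 74 features, so the function name `hairpin_features`, the feature names, and the toy hairpin are all illustrative assumptions, not the authors' feature set. Thermodynamic features such as folding free energy would need an RNA folding tool and are left out.

```python
import re


def hairpin_features(seq: str, structure: str) -> dict:
    """Compute a few illustrative (hypothetical) features of a candidate hairpin.

    seq       -- RNA sequence over A/C/G/U
    structure -- dot-bracket string of the same length, e.g. "(((....)))"
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    n = len(seq)

    # Recover base pairs by matching brackets with a stack.
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure")

    gc_pairs = sum(1 for i, j in pairs if {seq[i], seq[j]} == {"G", "C"})
    unpaired_runs = re.findall(r"\.+", structure)

    return {
        "length": n,
        # Primary-structure feature: fraction of G or C bases.
        "gc_content": sum(seq.count(b) for b in "GC") / n,
        # Secondary-structure features.
        "paired_fraction": 2 * len(pairs) / n,
        "gc_pair_fraction": gc_pairs / len(pairs) if pairs else 0.0,
        "longest_unpaired_run": max((len(r) for r in unpaired_runs), default=0),
    }


# Toy hairpin (made up for illustration): 3 bp stem, 4 nt loop.
features = hairpin_features("GCGAAAACGC", "(((....)))")
```

A vector of such values per candidate hairpin is the kind of input an SVM could then be trained on, with labels taken from known miRNA precursors and known non-miRNA hairpins.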

    Detection of A-to-I RNA editing in SARS-CoV-2

    ADAR1-mediated deamination of adenosines in long double-stranded RNAs plays an important role in modulating the innate immune response. However, recent investigations based on metatranscriptomic samples of COVID-19 patients and SARS-CoV-2-infected Vero cells have recovered contrasting findings. Using RNAseq data from time course experiments of infected human cell lines and transcriptome data from Vero cells and clinical samples, we prove that A-to-G changes observed in SARS-CoV-2 genomes represent genuine RNA editing events, likely mediated by ADAR1. While the A-to-I editing rate is generally low, changes are distributed along the entire viral genome, are overrepresented in exonic regions, and are (in the majority of cases) nonsynonymous. The impact of RNA editing on virus–host interactions could be relevant to identify potential targets for therapeutic interventions
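
Because inosine is read as guanosine during sequencing, A-to-I editing shows up as A-to-G mismatches between reads and the reference. The sketch below tallies substitution types from ungapped read placements. It is a simplified illustration under stated assumptions, not the pipeline used in the study: the function names, the `(start, read)` input format, and the toy data are hypothetical, and strand handling, base quality filtering, and separation of editing from SNPs or sequencing errors are all omitted.

```python
from collections import Counter


def substitution_counts(reference: str, reads: list) -> Counter:
    """Count (reference base, read base) mismatches.

    reads -- list of (start, sequence) tuples, where start is the 0-based
             position of the read on the reference (ungapped placement assumed).
    """
    counts = Counter()
    for start, read in reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos >= len(reference):
                break  # read runs past the end of the reference
            ref_base = reference[pos]
            # Only count unambiguous bases that differ from the reference.
            if base != ref_base and base in "ACGT" and ref_base in "ACGT":
                counts[(ref_base, base)] += 1
    return counts


def a_to_g_fraction(counts: Counter) -> float:
    """Fraction of all observed mismatches that are A-to-G."""
    total = sum(counts.values())
    return counts[("A", "G")] / total if total else 0.0


# Toy data, made up for illustration only.
reference = "ACGTACGTAA"
reads = [(0, "ACGTGCGT"), (4, "GCGTAA"), (2, "GTACTT")]
counts = substitution_counts(reference, reads)
fraction = a_to_g_fraction(counts)
```

An excess of A-to-G over the other mismatch types is the kind of signal that would then need statistical and biological validation before being attributed to editing.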

    Comparative genomics provides an operational classification system and reveals early emergence and biased spatio-temporal distribution of SARS-CoV-2

    Effective systems for the analysis of molecular data are of fundamental importance for real-time monitoring of the spread of infectious diseases and the study of pathogen evolution. While the Nextstrain and GISAID portals offer widely used systems for the classification of SARS-CoV-2 genomes, both present relevant limitations. Here we propose a highly reproducible method for the systematic classification of SARS-CoV-2 viral types. To demonstrate the validity of our approach, we conduct an extensive comparative genomic analysis of more than 20,000 SARS-CoV-2 genomes. Our classification system delineates 12 clusters and 4 super-clusters in SARS-CoV-2, with a highly biased spatio-temporal distribution worldwide, and provides important observations concerning the evolutionary processes associated with the emergence of novel viral types. Based on the estimates of SARS-CoV-2 evolutionary rate and genetic distances of genomes of the early pandemic phase, we infer that SARS-CoV-2 could have been circulating in humans since August-November 2019. The observed pattern of genomic variability is remarkably similar between all clusters and super-clusters, with the UTRs and the s2m element, a highly conserved secondary structure element, being the most variable genomic regions. While several polymorphic sites that are specific to one or more clusters were predicted to be under positive or negative selection, overall, our analyses also suggest that the emergence of novel genome types is unlikely to be driven by widespread convergent evolution and independent fixation of advantageous substitutions. While, in the absence of rigorous experimental validation, several questions concerning the evolutionary processes and the phenotypic characteristics (increased/decreased virulence) remain open, we believe that the approach outlined in this study can be of relevance for the tracking and functional characterization of different types of SARS-CoV-2 genomes

    Comparative genomics suggests limited variability and similar evolutionary patterns between major clades of SARS-CoV-2

    Phylogenomic analysis of SARS-CoV-2 genomes available from public repositories suggests the presence of 3 prevalent groups of viral episomes (super-clades), which are mostly associated with outbreaks in distinct geographic locations (China, USA and Europe). While levels of genomic variability between SARS-CoV-2 isolates are limited, to our knowledge, it is not clear whether the observed patterns of variability in viral super-clades reflect ongoing adaptation of SARS-CoV-2, or merely genetic drift and founder effects. Here, we analyze more than 1100 complete, high-quality SARS-CoV-2 genome sequences, and provide evidence for the absence of distinct evolutionary patterns/signatures in the genomes of the currently known major clades of SARS-CoV-2. Our analyses suggest that the presence of distinct viral episomes at different geographic locations is consistent with founder effects, coupled with the rapid spread of this novel virus. We observe that while cross-species adaptation of the virus is associated with hypervariability of specific protein coding regions (including the receptor-binding domain (RBD) of the spike protein), the more variable genomic regions between extant SARS-CoV-2 episomes correspond with the 3’ and 5’ UTRs, suggesting that at present viral protein coding genes should not be subjected to different adaptive evolutionary pressures in different viral strains. Although this study cannot be conclusive, we believe that the evidence presented here is strongly consistent with the notion that the biased geographic distribution of SARS-CoV-2 isolates should not be associated with adaptive evolution of this novel pathogen
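
One simple way to compare variability between genomic regions is to score each column of a multiple sequence alignment. The sketch below uses per-column Shannon entropy, which is a common choice but is an assumption here: the abstract does not state which variability measure the authors used, and the function name `column_entropy` and the toy alignment are illustrative only.

```python
import math
from collections import Counter


def column_entropy(alignment: list) -> list:
    """Shannon entropy (in bits) of each column of an alignment.

    alignment -- list of equal-length aligned sequences; characters other
                 than A/C/G/T (gaps, N) are ignored within each column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("all aligned sequences must have the same length")

    entropies = []
    for col in range(length):
        bases = [s[col] for s in alignment if s[col] in "ACGT"]
        total = len(bases)
        h = 0.0
        for count in Counter(bases).values():
            p = count / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


# Toy alignment, made up for illustration: column 0 is invariant,
# column 2 is split evenly between two bases.
toy_alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
entropy_per_site = column_entropy(toy_alignment)
```

Averaging such scores over annotated regions (for example UTRs versus coding sequences) would give a rough per-region comparison; invariant columns score 0 and a column split evenly between two bases scores 1 bit.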

    Towards an integrated pipeline for the in-silico prediction of plant microRNAs and their precursors

    MicroRNAs (miRNAs) are endogenous, small (~ 20 nt), single-stranded, non-coding RNAs that result from the processing of transcribed precursor hairpin structures. They are increasingly recognized as playing crucial roles as post-transcriptional antisense regulators of gene expression through regulation of mRNA stability or translational efficiency. The detection of homologs of known miRNAs through comparative genomic approaches has proved relatively tractable. However, the ab-initio prediction of potentially lineage-specific miRNA precursors through computational methods poses several additional difficulties, not least the fact that not all thermodynamically plausible transcribed hairpins are processed to yield mature miRNAs. We have developed a Support Vector Machine that considers up to 78 features associated with the primary and secondary structures and thermodynamic characteristics of candidate hairpin structures. Our SVM is highly specific in the discrimination of true miRNA precursors from “spurious” hairpins with levels of false positive predictions that are low relative to comparable methods. We also show how our SVM functions as part of an in-silico pipeline for the prediction of novel miRNA precursors in plant genomes

    Laniakea : an open solution to provide Galaxy "on-demand" instances over heterogeneous cloud infrastructures

    Background: While the popular workflow manager Galaxy is currently made available through several publicly accessible servers, there are scenarios where users can be better served by full administrative control over a private Galaxy instance, including, but not limited to, concerns about data privacy, customisation needs, prioritisation of particular job types, tools development, and training activities. In such cases, a cloud-based Galaxy virtual instance represents an alternative that equips the user with complete control over the Galaxy instance itself without the burden of the hardware and software infrastructure involved in running and maintaining a Galaxy server. Results: We present Laniakea, a complete software solution to set up a “Galaxy on-demand” platform as a service. Building on the INDIGO-DataCloud software stack, Laniakea can be deployed over common cloud architectures usually supported both by public and private e-infrastructures. The user interacts with a Laniakea-based service through a simple front-end that allows a general setup of a Galaxy instance, and then Laniakea takes care of the automatic deployment of the virtual hardware and the software components. At the end of the process, the user gains access with full administrative privileges to a private, production-grade, fully customisable, Galaxy virtual instance and to the underlying virtual machine (VM). Laniakea features deployment of single-server or cluster-backed Galaxy instances, sharing of reference data across multiple instances, data volume encryption, and support for VM image-based, Docker-based, and Ansible recipe-based Galaxy deployments. A Laniakea-based Galaxy on-demand service, named Laniakea@ReCaS, is currently hosted at the ELIXIR-IT ReCaS cloud facility. Conclusions: Laniakea offers to scientific e-infrastructures a complete and easy-to-use software solution to provide a Galaxy on-demand service to their users.
Laniakea-based cloud services will help in making Galaxy more accessible to a broader user base by removing most of the burdens involved in deploying and running a Galaxy service. In turn, this will facilitate the adoption of Galaxy in scenarios where classic public instances do not represent an optimal solution. Finally, the implementation of Laniakea can be easily adapted and expanded to support different services and platforms beyond Galaxy

    Stem cell impairment at the host–microbiota interface in colorectal cancer

    Colorectal cancer (CRC) initiation is believed to result from the conversion of normal intestinal stem cells (ISCs) into cancer stem cells (CSCs), also known as tumor-initiating cells (TICs). Hence, CRC evolves through the multiple acquisition of well-established genetic and epigenetic alterations with an adenoma–carcinoma sequence progression. Unlike other stem cells elsewhere in the body, ISCs cohabit with the intestinal microbiota, which consists of a diverse community of microorganisms, including bacteria, fungi, and viruses. The gut microbiota communicates closely with ISCs and mounting evidence suggests that there is significant crosstalk between host and microbiota at the ISC niche level. Metagenomic analyses have demonstrated that the host–microbiota mutually beneficial symbiosis existing under physiologic conditions is lost during a state of pathological microbial imbalance due to the alteration of microbiota composition (dysbiosis) and/or the genetic susceptibility of the host. The complex interaction between CRC and microbiota is at the forefront of the current CRC research, and there is growing attention on a possible role of the gut microbiome in the pathogenesis of CRC through ISC niche impairment. Here we primarily review the most recent findings on the molecular mechanism underlying the complex interplay between gut microbiota and ISCs, revealing a possible key role of microbiota in the aberrant reprogramming of CSCs in the initiation of CRC. We also discuss recent advances in OMICS approaches and single-cell analyses to explore the relationship between gut microbiota and ISC/CSC niche biology leading to a desirable implementation of the current precision medicine approaches

    Laniakea@ReCaS: an ELIXIR-ITALY Galaxy on-demand cloud service

    Although several Galaxy public services are available, a private Galaxy instance is still mandatory or preferable for several use cases including heavy workloads, data privacy concerns or particular customization needs. Cloud computing technologies provide a viable way to deploy Galaxy private instances, freeing users from the onerous deployment and maintenance of local IT infrastructures. In the last few years, ELIXIR-IT led the development of Laniakea, a software framework that facilitates the provisioning of on-demand Galaxy instances as a cloud service over e-infrastructures. The user interacts with a Laniakea service through a web front-end that allows the user to configure and launch a production-grade Galaxy instance in a straightforward way. Through the interface, the user can deploy Galaxy instances over single VMs or virtual clusters, and link them to shared reference data volumes and to plain or encrypted volumes for storing data. A selection of “flavours”, that is, Galaxy instances pre-configured with sets of tools for specific tasks, is also available. When the user is satisfied, Laniakea takes over and deploys the desired Galaxy instance over the cloud, providing a public IP and full administrative privileges over the new instance. In Dec-2018, we launched the beta-test phase of the first Laniakea-based Galaxy on-demand ELIXIR-IT service: Laniakea@ReCaS. After six months of helpful testing, we are now ready to announce the production phase of this service. Access to the service will be provided on a per-project basis through an open-ended call defining terms and conditions; project proposals will be evaluated by a scientific and technical board. Accepted proposals will be granted a package of computational resources for running on-demand Galaxy instances for a duration compatible with the project requirements

    Investigating human mitochondrial genomes in single cells

    Mitochondria host multiple copies of their own small circular genome that has been extensively studied to trace the evolution of the modern eukaryotic cell and discover important mutations linked to inherited diseases. Whole genome and exome sequencing have enabled the study of mtDNA in a large number of samples and experimental conditions at single nucleotide resolution, allowing the deciphering of the relationship between inherited mutations and phenotypes and the identification of acquired mtDNA mutations in classical mitochondrial diseases as well as in chronic disorders, ageing and cancer. By applying an ad hoc computational pipeline based on our MToolBox software, we reconstructed mtDNA genomes in single cells using whole genome and exome sequencing data obtained by different amplification methodologies (eWGA, DOP-PCR, MALBAC, MDA) as well as data from single cell Assay for Transposase Accessible Chromatin with high-throughput sequencing (scATAC-seq) in which mtDNA sequences are expected as a byproduct of the technology. We show that assembled mtDNAs, with the exception of those reconstructed by MALBAC and DOP-PCR methods, are quite uniform and suitable for genomic investigations, enabling the study of various biological processes related to cellular heterogeneity such as tumor evolution, neural somatic mosaicism and embryonic development

    Laniakea: a Galaxy-on-demand Provider Platform Through Cloud Technologies

    Galaxy is rapidly becoming the de facto standard workflow manager for bioinformatics. Although several Galaxy public services are currently available, the usage of a private Galaxy instance is still mandatory or preferable for several use cases, including heavy workloads, data privacy concerns or particular customization needs. In this context, cloud computing technologies and infrastructures can provide a powerful and scalable solution to avoid the onerous deployment and maintenance of a local hardware and software infrastructure. Laniakea is a software framework that facilitates the provisioning of on-demand Galaxy instances as a cloud service over e-infrastructures, by leveraging the open source software catalogue developed by the INDIGO-DataCloud H2020 project, which aimed to make cloud e-infrastructures more accessible to scientific communities. End-users interact with Laniakea through a web front-end that allows a general setup of a Galaxy instance. The deployment of the virtual hardware and of the Galaxy software ecosystem is subsequently performed by the INDIGO Platform as a Service layer. At the end of the process, the user gains access to a private, production-grade, fully customizable, Galaxy virtual instance. Laniakea features the deployment of stand-alone or cluster-backed Galaxy instances, shared reference data volumes, encrypted data volumes and rapid development of novel Galaxy flavours for specific tasks. We present here the latest development iteration of Laniakea, introducing a novel and strongly configurable web interface that facilitates a more straightforward customisation of the user experience through human-readable YAML syntax and a reworked encryption procedure that exploits Hashicorp Vault as an encryption key management system